Recognizing Typeset Documents using Walsh Transformation
Authors
Abstract
In this paper we present an effective character recognition algorithm intended mainly for typeset documents. Our aim was to design an algorithm that recognizes simple typeset documents quickly and reliably. For good results, the input document should draw its characters from a single character set with a small number of symbols. This is not a strong restriction, since documents in practice usually have this property. The core recognition step is based on the Walsh transformation, which gives a detailed description of the image, such as its symmetry relations and the placement of foreground and background pixels. This motivated us to apply it to character recognition, and the algorithm proved fairly efficient and reliable for simple documents, since the feature vectors extracted by the Walsh transformation can be distinguished well. Moreover, our method tolerated different types of noise corruption very well.
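As a rough illustration of the idea in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes binary glyph images whose side length is a power of two, uses the natural (Hadamard) ordering of the Walsh functions, and classifies by nearest template in coefficient space. The names `fwht`, `walsh_2d`, `classify`, and the 4x4 sample glyphs are hypothetical and chosen only for this example.

```python
def fwht(values):
    """Fast Walsh-Hadamard transform (natural ordering, unnormalized)."""
    out = list(values)
    n = len(out)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    h = 1
    while h < n:
        # Butterfly step: combine entries that lie h positions apart.
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b
        h *= 2
    return out


def walsh_2d(image):
    """2D transform: apply fwht to every row, then to every column."""
    rows = [fwht(row) for row in image]
    cols = [fwht(col) for col in zip(*rows)]
    return [list(r) for r in zip(*cols)]  # transpose back to row-major


def features(image):
    """Flatten the 2D coefficients into one feature vector."""
    return [c for row in walsh_2d(image) for c in row]


def classify(image, templates):
    """Return the label whose template features are closest (squared L2)."""
    vec = features(image)

    def dist(label):
        return sum((a - b) ** 2 for a, b in zip(vec, templates[label]))

    return min(templates, key=dist)


# Hypothetical 4x4 glyphs: a vertical bar and a horizontal bar.
BAR_V = [[0, 1, 1, 0] for _ in range(4)]
BAR_H = [[0, 0, 0, 0], [1, 1, 1, 1], [1, 1, 1, 1], [0, 0, 0, 0]]
TEMPLATES = {"V": features(BAR_V), "H": features(BAR_H)}

# Vertical bar with one foreground pixel lost to noise.
NOISY_V = [[0, 1, 1, 0], [0, 1, 1, 0], [0, 1, 0, 0], [0, 1, 1, 0]]
print(classify(NOISY_V, TEMPLATES))
```

Because the unnormalized transform is orthogonal up to a constant factor, comparing all coefficients with a Euclidean distance ranks templates exactly as a pixel-wise comparison would. A practical recognizer would presumably select or weight a subset of coefficients; that choice is not specified in the abstract and is left open here.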
Related resources
An Algorithm Using Walsh Transformation for Compressing Typeset Documents
In this paper the authors present an algorithm that can be used principally for compressing text documents. The algorithm allows some loss of information, but the original digital image is compressed rather efficiently, so the resulting compressed data structure is suitable for transmission through some kind of telecommunication channel. The original document is assumed not to contain so...
Zernike moments and neural networks for recognition of isolated Arabic characters
The aim of this work is to present a system for recognizing isolated Arabic printed characters. This system goes through several stages: preprocessing, feature extraction and classification. Zernike moments, invariant moments and Walsh transformation are used to calculate the features. The classification is based on multilayer neural networks. A recognition rate of 98% is achieved by using Zern...
Evaluation of Psychometric Properties of Walsh Family Resilience Questionnaire
Background: Considering the importance of family resilience and the broader range of applications that focus on the resilience of families, in the current study, the introduction of resilience structure of the family has been identified as an essential research demand. Therefore, consideration of the psychometric properties of the most widely used tools in this area, including family resilience...
Automatic Generation of Character Groundtruth for Scanned Documents: A Closed-Loop Approach - Proceedings of the 13th International Conference on Pattern Recognition, 1996
Character groundtruth for scanned document images is crucial for evaluating the performance of OCR systems, training OCR algorithms, and validating document degradation models. Unfortunately, manual collection of accurate groundtruth for characters in a real (scanned) document image is not possible because (i) accuracy in delineating groundtruth character bounding boxes is not high enough, (ii)...
An Automatic Closed-Loop Methodology for Generating Character Groundtruth for Scanned Documents
Character groundtruth for real, scanned document images is crucial for evaluating the performance of OCR systems, training OCR algorithms, and validating document degradation models. Unfortunately, manual collection of accurate groundtruth for characters in a real (scanned) document image is not practical because (i) accuracy in delineating groundtruth character bounding boxes is not high enoug...